Tile size selection using cache organization and data layout
Related resources
Adaptive Models for Tile Size Selection
Tiling (or blocking) is widely used to exploit data locality and coarse-grained parallelism. Tile sizes significantly influence performance, and several models have been proposed for tile size selection. However, with advances in hardware and compiler optimizations, previous models are no longer effective. Developing efficient models each time the hardware or compiler changes requires extensi...
Analytical Bounds for Optimal Tile Size Selection
In this paper, we introduce a novel approach to guide tile size selection by employing analytical models to limit empirical search within a subspace of the full search space. Two analytical models are used together: 1) an existing conservative model, based on the data footprint of a tile, which ignores intra-tile cache block replacement, and 2) an aggressive new model that assumes optimal cache...
Neural Network Assisted Tile Size Selection
Data locality optimization plays a significant role in reducing the execution time of many loop-intensive kernels. Loop tiling at various levels is often used to effectively exploit data locality in deep memory hierarchies. The recent development of frameworks for parametric loop tiling of user code has led to a widening of the range of applications that could benefit from auto-tunin...
Using Application Bisection Bandwidth to Guide Tile Size Selection for the Synchroscalar Tile-Based Architecture
This paper investigates the impact of proper tile size selection on the power consumption for tile-based processors. We refer to this investigation as a tile granularity study. This is accomplished by distilling the architectural cost of tiles with different computational widths into a system metric we call the Granularity Indicator (GI). The GI is then compared against the bisection ...
Model-Driven Tile Size Selection for DOACROSS Loops on GPUs
DOALL loops are tiled to exploit DOALL parallelism and data locality on GPUs. In contrast, due to loop-carried dependences, DOACROSS loops must be skewed first in order to make tiling legal and exploit wavefront parallelism across the tiles and within a tile. Thus, tile size selection, which is performance-critical, becomes more complex for DOACROSS loops than DOALL loops on GPUs. This paper pr...
Journal
Journal title: ACM SIGPLAN Notices
Year: 1995
ISSN: 0362-1340,1558-1160
DOI: 10.1145/223428.207162